Disclaimer: This report was written for the author's learning purposes only and uses open data from Public Health Scotland under the UK Open Government Licence (OGL).

Aim

To inform the planning and provision of cancer treatment services by analysing breast cancer incidence data reported by NHS Borders.

Introduction

Between 1997 and 2021, breast cancer had the third-highest number of incidences of any cancer type reported by NHS Borders. In this period, breast cancer in males made up less than 1% of total breast cancer incidences, so this report will focus on incidences among females.

cancer_incidence_borders %>%
  filter(cancer_site != "All Cancer Types",
         sex == "All") %>%
  group_by(cancer_site) %>%
  summarise(total_incidences = sum(incidences_all_ages)) %>%
  arrange(desc(total_incidences)) %>%
  filter(total_incidences > 2000) %>% 
  gt() %>%
  tab_header(title = md("**Total Cancer Incidences by Cancer Site**"),
             subtitle = "NHS Borders (1997-2021): Sites w/ Over 2000 Total Incidences") %>% 
  cols_label(
    cancer_site = "Cancer Site",
    total_incidences = "Total Incidences") %>%
  tab_options(column_labels.font.weight = "bold",
              table.align = "left") %>% 
  gt_highlight_rows(rows = 3)

Total Cancer Incidences by Cancer Site
NHS Borders (1997-2021): Sites w/ Over 2000 Total Incidences

Cancer Site | Total Incidences
Non-Melanoma Skin Cancer | 6174
Basal Cell Carcinoma Of The Skin | 4049
Breast | 2614
Trachea, Bronchus And Lung | 2534
Colorectal Cancer | 2514
Squamous Cell Carcinoma Of The Skin | 2075

cancer_incidence_borders %>%
  filter(cancer_site == "Breast",
         sex != "All") %>%
  group_by(sex) %>%
  summarise(total_incidences = sum(incidences_all_ages)) %>%
  arrange(desc(total_incidences)) %>%
  head(3) %>% 
  gt() %>%
  tab_header(title = md("**Breast Cancer Incidences by Sex**"),
             subtitle = "NHS Borders (1997-2021)") %>% 
  cols_label(
    sex = "Sex",
    total_incidences = "Total Incidences") %>%
  tab_options(column_labels.font.weight = 'bold',
              table.align = "left")

Breast Cancer Incidences by Sex
NHS Borders (1997-2021)

Sex | Total Incidences
Female | 2598
Male | 16


According to NHS Borders data, breast cancer among females has the highest number of incidences and highest mean European age-standardised rate (EASR) of any cancer type.

cancer_incidence_borders %>%
  filter(cancer_site != "All Cancer Types",
         sex == "Female") %>%
  group_by(cancer_site) %>%
  summarise(total_incidences = sum(incidences_all_ages)) %>%
  arrange(desc(total_incidences)) %>%
  head(3) %>% 
  gt() %>%
  tab_header(title = md("**Female Cancer Incidences**"),
             subtitle = "NHS Borders (1997-2021)") %>% 
  cols_label(
    cancer_site = "Cancer Site",
    total_incidences = "Total Incidences") %>%
  tab_options(column_labels.font.weight = "bold") %>% 
  gt_highlight_rows(rows = 1)

Female Cancer Incidences
NHS Borders (1997-2021)

Cancer Site | Total Incidences
Breast | 2598
Non-Melanoma Skin Cancer | 2519
Basal Cell Carcinoma Of The Skin | 1882

cancer_incidence_borders %>%
  filter(cancer_site != "All Cancer Types",
         sex == "Female") %>%
  group_by(cancer_site) %>%
  summarise(mean_easr = mean(easr)) %>%
  arrange(desc(mean_easr)) %>%
  head(3) %>%
  gt() %>%
  tab_header(title = md("**Female EASR by Cancer Type**"),
             subtitle = "NHS Borders (1997-2021)") %>% 
  cols_label(
    cancer_site = "Cancer Site",
    mean_easr = "Mean EASR") %>%
  tab_options(column_labels.font.weight = "bold") %>% 
  gt_highlight_rows(rows = 1)

Female EASR by Cancer Type
NHS Borders (1997-2021)

Cancer Site | Mean EASR
Breast | 161.3640
Non-Melanoma Skin Cancer | 150.3996
Basal Cell Carcinoma Of The Skin | 113.9178

Health Board Comparison

To understand how these rates compare with those of other health boards in Scotland, we can visualise the EASR over a five-year period (2017-2021). The EASR is the European age-standardised incidence rate per 100,000 person-years at risk.
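
As a rough illustration of what "age-standardised" means, the base R sketch below applies direct standardisation: each age-specific rate is weighted by a fixed standard population, so that areas with different age structures can be compared. Every number here, including the three age bands and the weights, is made up for illustration; these are not Public Health Scotland figures and not the actual European Standard Population weights.

```r
# Direct age standardisation: a weighted average of age-specific rates.
# All values are hypothetical and chosen only to make the arithmetic easy to follow.
age_groups <- data.frame(
  age_group    = c("0-49", "50-69", "70+"),
  cases        = c(20, 90, 60),          # hypothetical cases observed
  person_years = c(30000, 15000, 8000),  # hypothetical person-years at risk
  std_weight   = c(60000, 27000, 13000)  # hypothetical standard population (sums to 100,000)
)

# Age-specific incidence rates per 100,000 person-years
age_groups$rate <- age_groups$cases / age_groups$person_years * 100000

# Standardised rate: age-specific rates weighted by the standard population
easr_example <- sum(age_groups$rate * age_groups$std_weight) / sum(age_groups$std_weight)
easr_example
```

Because the same weights are applied to every health board, differences in the resulting rate reflect differences in age-specific incidence rather than differences in how old each population is.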

geo_summary %>% 
  ggplot(aes(fill = easr)) + 
  geom_sf(colour = "white", linewidth = 0.04) +
  labs(
    title = "Female Breast Cancer EASR (2017-2021)",
    subtitle = "By NHS Health Board",
    fill = "EASR") +
  scale_fill_distiller(palette = "Blues", direction = +1) +
  theme(plot.title = element_text(size = 15, face = "bold"),
        plot.subtitle = element_text(size = 10),
        legend.title = element_text(face = "bold"),
        panel.background = element_rect(fill = "white"),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank(),
        rect = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank())

NB: Unfortunately, data for the individual health boards NHS Western Isles, NHS Shetland and NHS Orkney were not available at the time of report completion.

five_year_summary %>%
  select(hb, cancer_site, sex, year, easr) %>%
  filter(sex == "Female",
         cancer_site == "Breast",
         hb != "GR0800001") %>%
  left_join(geography_codes, "hb") %>% 
  select(hb_name, easr) %>% 
  arrange(desc(easr)) %>% 
  gt() %>%
  tab_header(title = md("**Female Breast Cancer EASR (2017-2021)**")) %>% 
  cols_label(
    hb_name = "Health Board",
    easr = "EASR") %>% 
  tab_options(column_labels.font.weight = 'bold') %>%
  data_color(columns = easr, palette = "Blues")

Female Breast Cancer EASR (2017-2021)

Health Board | EASR
NHS Dumfries and Galloway | 174.6153
NHS Lothian | 172.3179
NHS Forth Valley | 171.6585
NHS Lanarkshire | 169.1486
NHS Greater Glasgow and Clyde | 168.8007
NHS Borders | 164.8136
NHS Fife | 164.4207
NHS Tayside | 163.4222
NHS Highland | 162.5039
NHS Ayrshire and Arran | 157.0019
NHS Grampian | 156.2987

Hypothesis Test

Question: Is the greater number of female breast cancer incidences in “peak years” (1999, 2002, 2005, 2008, 2011, 2014, 2017) compared to “non-peak years” (1997, 1998, 2000, 2001, 2003, 2006, 2007, 2009, 2010, 2012, 2013, 2015, 2016, 2018, 2019) statistically significant?

# Years treated as "peak" in the hypothesis test
peak_years <- c(1999, 2002, 2005, 2008, 2011, 2014, 2017)

cancer_incidence_borders_sample <- cancer_incidence_borders %>%
  filter(sex == "Female", cancer_site == "Breast") %>%
  select(id, cancer_site, sex, year, incidences_all_ages) %>%
  # Label each year as "peak" or "standard" for the two-sample comparison
  mutate(peak = if_else(year %in% peak_years, "peak", "standard"))

observed_stat <- cancer_incidence_borders_sample %>% 
  specify(incidences_all_ages ~ peak) %>%
  calculate(stat = "diff in means", order = c("peak", "standard"))

null_distribution <- cancer_incidence_borders_sample %>% 
  specify(response = incidences_all_ages, explanatory = peak) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>% 
  calculate(stat = "diff in means", order = c("peak", "standard"))

p_value <- null_distribution %>%
  get_p_value(obs_stat = observed_stat, direction = "right")

Test Used: Two Sample Mean Test (Independent)
Significance Level: 0.05

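H0: \(\mu_{1} - \mu_{2} = 0\)
H1: \(\mu_{1} - \mu_{2} > 0\)

where \(\mu_{1}\) is the mean number of incidences in peak years and \(\mu_{2}\) is the mean number in standard (non-peak) years.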

Result: Using a null distribution generated by permutation (1,000 replicates), the test returns a very low p-value, below the 0.05 significance level. We therefore reject H0 in favour of H1: the evidence suggests that the mean number of female breast cancer incidences is significantly higher in “peak years”.
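
To make the logic of the `generate(type = "permute")` step more concrete, here is a minimal base R sketch of a one-sided permutation test for a difference in means. The yearly counts in `toy` are invented for illustration and are not the NHS Borders figures; `diff_in_means` is a hypothetical helper, not part of the infer package.

```r
set.seed(42)  # make the random shuffles reproducible

# Hypothetical yearly counts, NOT the NHS Borders data
toy <- data.frame(
  incidences = c(120, 118, 125, 122, 95, 101, 99, 104, 97, 102),
  peak       = c(rep("peak", 4), rep("standard", 6))
)

# Hypothetical helper: mean of the "peak" group minus mean of the "standard" group
diff_in_means <- function(values, labels) {
  mean(values[labels == "peak"]) - mean(values[labels == "standard"])
}

observed <- diff_in_means(toy$incidences, toy$peak)

# Under H0 the labels are exchangeable, so shuffle them and recompute the statistic
null_diffs <- replicate(1000, diff_in_means(toy$incidences, sample(toy$peak)))

# Right-tailed p-value: share of shuffled differences at least as large as the observed one
p_value <- mean(null_diffs >= observed)
p_value
```

The p-value is simply the proportion of label shuffles that produce a difference at least as extreme as the one observed, which is why no distributional assumption about the yearly counts is needed.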


Incidences by Age

fig2_plot <- five_year_summary_long %>%
  filter(cancer_site == "Breast",
         sex == "Female") %>%
  ggplot() +
  geom_col(aes(x = age, y = incidences, 
               text = paste0("<b>Age:</b> ", age, "<br>", "<b>Incidences:</b> ", incidences, "<br>")),
           fill = "#0391BF") +
  theme(axis.text.x = element_text(angle = 45, vjust = 0.5)) +
  labs(
    x = "\n Age",
    y = "Incidences\n",
    title = "Total Female Breast Cancer Incidences by Age") +
  theme(panel.background = element_rect(fill = "white"),
        panel.grid = element_line(colour = "grey90"))

ggplotly(fig2_plot, tooltip = "text") %>%
  layout(title = list(text = paste0("<b>Total Female Breast Cancer Incidences by Age</b>",
                                    "<br>",
                                    "<sup>",
                                    "NHS Borders: 1997-2021",
                                    "</sup>")))

What does this visualisation tell us?

  • The majority of breast cancer incidences in females appear to occur among those aged 50 to 79.

Why might these age groups see increased incidence numbers?

  • Currently only women between the ages of 50 and 70 are routinely screened (NHS National Services Scotland, 2022).

NHS Borders Population Projections:

Females 50+ 2021: 29889

Females 50+ 2041: 31148 (4.2% increase)

(National Records of Scotland, 2023)
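
The projected increase follows directly from the two population figures quoted above. A minimal R check, where the variable names are chosen here purely for illustration:

```r
# Projected female 50+ population in NHS Borders (figures quoted in this report)
females_50_plus_2021 <- 29889
females_50_plus_2041 <- 31148

# Percentage change relative to the 2021 baseline
pct_increase <- (females_50_plus_2041 - females_50_plus_2021) / females_50_plus_2021 * 100
round(pct_increase, 1)
```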


Conclusions / Recommendations

  • Screening data should be reviewed to establish whether the backlog resulting from COVID-19 has been cleared, and therefore whether a further increase in incidences should be anticipated in 2022.

  • Resources should be allocated according to the observed trend of increased incidences every three years.

  • Research/Analysis should be conducted to further understand and confirm any reason for this trend, including any links to screening schedules.

  • Research/Analysis should be conducted to establish whether increased incidence with age is in any way the result of current screening criteria and, if so, whether screening criteria should be widened.

  • Long-term service planning should take into consideration the ~4% projected population increase of the female 50+ demographic in NHS Borders between 2021 and 2041, as projected by the National Records of Scotland.


Data Sources